Overview


Dataset statistics

Number of variables: 16
Number of observations: 336713
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.1 MiB
Average record size in memory: 128.0 B
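
The memory figures above are mutually consistent if each of the 16 columns is counted at 8 bytes per row. That 8-byte width is an assumption (a 64-bit number or an object pointer, counted shallowly); the report does not state it. A minimal check under that assumption:

```python
# Sanity check of the memory figures in the dataset statistics.
# Assumption (not stated in the report): every cell is counted as 8 bytes,
# e.g. a 64-bit number or an object pointer, with string payloads excluded.
MIB = 1024 ** 2

n_rows = 336_713        # Number of observations
n_cols = 16             # Number of variables
bytes_per_cell = 8      # assumed width of one cell

record_bytes = n_cols * bytes_per_cell
total_mib = n_rows * record_bytes / MIB
column_mib = n_rows * bytes_per_cell / MIB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
print(f"per column:  {column_mib:.1f} MiB")
```

Under the same assumption, the per-column figure matches the "Memory size" of 2.6 MiB reported for each variable.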

Variable types

Text: 4
Categorical: 5
Numeric: 6
DateTime: 1

Alerts

age is highly overall correlated with birth (High correlation)
birth is highly overall correlated with age (High correlation)
category is highly overall correlated with price and 1 other field (High correlation)
month is highly overall correlated with year (High correlation)
price is highly overall correlated with category and 1 other field (High correlation)
sub_category is highly overall correlated with category and 1 other field (High correlation)
year is highly overall correlated with month (High correlation)

Reproduction

Analysis started: 2024-10-16 10:19:32.007552
Analysis finished: 2024-10-16 10:20:41.983183
Duration: 1 minute and 9.98 seconds
Software version: ydata-profiling v4.10.0
Download configuration: config.json
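
The reported duration can be recomputed from the two timestamps above. The following is a small sketch using only the Python standard library; the format string is an assumption about how the timestamps are laid out, based on how they appear in this report.

```python
from datetime import datetime

# Timestamp layout assumed from the values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-10-16 10:19:32.007552", FMT)
finished = datetime.strptime("2024-10-16 10:20:41.983183", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```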

Variables

id_prod
Text

Distinct: 3264
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 6
Median length: 6
Mean length: 5.5317615
Min length: 3

Characters and Unicode

Total characters: 1862616
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
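
The length and character figures for this variable are tied together: with no missing values, the mean length is the total character count divided by the number of observations. The sketch below checks that, and also that the "_" count in this variable's character table (336713) works out to one underscore per value. Variable names are illustrative only.

```python
# Figures copied from this variable's section of the report.
n_values = 336_713            # Number of observations (no missing values)
total_characters = 1_862_616  # Total characters
underscore_count = 336_713    # count of "_" among the most occurring characters

mean_length = total_characters / n_values
underscores_per_value = underscore_count / n_values

print(f"mean length: {mean_length:.7f}")
print(f"underscores per value: {underscores_per_value:.1f}")
```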

Unique

Unique: 48
Unique (%): < 0.1%

Sample

1st row: 0_1483
2nd row: 2_226
3rd row: 1_374
4th row: 0_2186
5th row: 0_1351

Value                Count   Frequency (%)
1_369                1081    0.3%
1_417                1062    0.3%
1_498                1036    0.3%
1_414                1027    0.3%
1_425                1013    0.3%
1_398                952     0.3%
1_406                946     0.3%
1_413                944     0.3%
1_403                939     0.3%
1_407                933     0.3%
Other values (3254)  326780  97.1%

Most occurring characters

Value  Count   Frequency (%)
1      382993  20.6%
_      336713  18.1%
0      301062  16.2%
2      142873  7.7%
4      138052  7.4%
3      132907  7.1%
5      105868  5.7%
6      101227  5.4%
8      75015   4.0%
7      74341   4.0%

year
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4
Histogram of lengths of the category

Characters and Unicode

Total characters: 1346852
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2022
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value  Count   Frequency (%)
2021   277846  82.5%
2022   58867   17.5%

Most occurring characters

Value  Count   Frequency (%)
2      732293  54.4%
0      336713  25.0%
1      277846  20.6%
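
The character counts for year follow directly from its two value counts: each occurrence of a value contributes one count per character it contains. A minimal sketch of that derivation, with the value counts copied from this section:

```python
from collections import Counter

# Value counts copied from the common values reported for year.
value_counts = {"2021": 277_846, "2022": 58_867}

# Each occurrence of a value contributes one count per character in it.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(f"{ch}  {count}  {count / total:.1%}")
```

The totals agree with the section: 732293 for "2", 336713 for "0", 277846 for "1", and 1346852 characters overall.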

month
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 9
Median length: 7
Mean length: 6.256631
Min length: 3
Histogram of lengths of the category

Characters and Unicode

Total characters: 2106689
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: April
2nd row: February
3rd row: September
4th row: October
5th row: July

Common Values

Value             Count  Frequency (%)
September         33254  9.9%
December          32417  9.6%
February          29556  8.8%
January           29311  8.7%
March             28559  8.5%
April             28401  8.4%
November          28267  8.4%
May               28237  8.4%
June              26812  8.0%
August            25610  7.6%
Other values (2)  46289  13.7%

Most occurring characters

Value              Count   Frequency (%)
e                  331492  15.7%
r                  260898  12.4%
u                  161611  7.7%
b                  145071  6.9%
a                  144974  6.9%
y                  111816  5.3%
m                  93938   4.5%
c                  82553   3.9%
J                  80835   3.8%
t                  80441   3.8%
Other values (16)  613060  29.1%

day
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.801787
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
median: 16
Q3: 23
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.8184205
Coefficient of variation (CV): 0.55806478
Kurtosis: -1.1961889
Mean: 15.801787
Median Absolute Deviation (MAD): 8
Skewness: -0.0024937699
Sum: 5320667
Variance: 77.764541
Monotonicity: Not monotonic
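
Several of the statistics above are simple functions of the others. The sketch below recomputes them from the reported mean, standard deviation, and quantiles for day, using the textbook definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). Small differences in the last digits are expected because the inputs are already rounded.

```python
# Inputs copied from the statistics reported for day.
n = 336_713            # Number of observations
mean = 15.801787
std = 8.8184205
q1, q3 = 8, 23
minimum, maximum = 1, 31

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV:       {cv:.8f}")
print(f"Variance: {variance:.6f}")
print(f"Sum:      {total:.0f}")
print(f"Range:    {value_range}")
print(f"IQR:      {iqr}")
```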
Histogram with fixed size bins (bins=31)

Common Values

Value              Count   Frequency (%)
28                 11528   3.4%
1                  11293   3.4%
16                 11253   3.3%
26                 11165   3.3%
27                 11161   3.3%
23                 11135   3.3%
17                 11125   3.3%
24                 11111   3.3%
14                 11099   3.3%
11                 11095   3.3%
Other values (21)  224748  66.7%

Minimum 10 values

Value  Count  Frequency (%)
1      11293  3.4%
2      10957  3.3%
3      10936  3.2%
4      10663  3.2%
5      10839  3.2%
6      11016  3.3%
7      10938  3.2%
8      11095  3.3%
9      10945  3.3%
10     10965  3.3%

Maximum 10 values

Value  Count  Frequency (%)
31     6402   1.9%
30     10571  3.1%
29     10579  3.1%
28     11528  3.4%
27     11161  3.3%
26     11165  3.3%
25     11028  3.3%
24     11111  3.3%
23     11135  3.3%
22     11019  3.3%

time
Date

Distinct: 84576
Distinct (%): 25.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB
Minimum: 2024-10-16 00:00:00
Maximum: 2024-10-16 23:59:59
Histogram with fixed size bins (bins=50)

session_id
Text

Distinct: 169173
Distinct (%): 50.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.344014
Min length: 3

Characters and Unicode

Total characters: 2472825
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 80318
Unique (%): 23.9%

Sample

1st row: s_18746
2nd row: s_159142
3rd row: s_94290
4th row: s_105936
5th row: s_63642

Value                  Count   Frequency (%)
s_118668               14      < 0.1%
s_96857                13      < 0.1%
s_21005                13      < 0.1%
s_168560               12      < 0.1%
s_93892                12      < 0.1%
s_136877               11      < 0.1%
s_163784               11      < 0.1%
s_155361               11      < 0.1%
s_3533                 11      < 0.1%
s_112803               11      < 0.1%
Other values (169163)  336594  > 99.9%

Most occurring characters

Value             Count   Frequency (%)
s                 336713  13.6%
_                 336713  13.6%
1                 313672  12.7%
2                 175267  7.1%
3                 175028  7.1%
4                 174458  7.1%
6                 173234  7.0%
5                 172531  7.0%
7                 158229  6.4%
8                 154935  6.3%
Other values (2)  302045  12.2%


client_id
Text

Distinct: 8600
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 6
Median length: 6
Mean length: 5.881745
Min length: 3

Characters and Unicode

Total characters: 1980460
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): < 0.1%

Sample

1st row: c_4450
2nd row: c_277
3rd row: c_4270
4th row: c_4597
5th row: c_1242

Value                Count   Frequency (%)
c_1609               12855   3.8%
c_6714               4471    1.3%
c_3454               3273    1.0%
c_4958               2562    0.8%
c_2140               195     0.1%
c_7959               195     0.1%
c_2595               193     0.1%
c_8026               192     0.1%
c_3725               190     0.1%
c_8392               189     0.1%
Other values (8590)  312398  92.8%

Most occurring characters

Value             Count   Frequency (%)
c                 336713  17.0%
_                 336713  17.0%
1                 149042  7.5%
4                 145000  7.3%
6                 144674  7.3%
5                 138632  7.0%
7                 133436  6.7%
3                 133191  6.7%
2                 133142  6.7%
8                 116476  5.9%
Other values (2)  213441  10.8%

quantity_sold
Real number (ℝ)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4987452
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 5
Q3: 8
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.8746987
Coefficient of variation (CV): 0.52279175
Kurtosis: -1.2271436
Mean: 5.4987452
Median Absolute Deviation (MAD): 3
Skewness: 0.0014324165
Sum: 1851499
Variance: 8.2638923
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common Values

Value  Count  Frequency (%)
7      33873  10.1%
4      33855  10.1%
3      33770  10.0%
2      33765  10.0%
10     33763  10.0%
9      33725  10.0%
1      33695  10.0%
8      33543  10.0%
5      33410  9.9%
6      33314  9.9%

Minimum 10 values

Value  Count  Frequency (%)
1      33695  10.0%
2      33765  10.0%
3      33770  10.0%
4      33855  10.1%
5      33410  9.9%
6      33314  9.9%
7      33873  10.1%
8      33543  10.0%
9      33725  10.0%
10     33763  10.0%

Maximum 10 values

Value  Count  Frequency (%)
10     33763  10.0%
9      33725  10.0%
8      33543  10.0%
7      33873  10.1%
6      33314  9.9%
5      33410  9.9%
4      33855  10.1%
3      33770  10.0%
2      33765  10.0%
1      33695  10.0%

product_id
Text

Distinct: 3264
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 6
Median length: 6
Mean length: 5.5317615
Min length: 3

Characters and Unicode

Total characters: 1862616
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): < 0.1%

Sample

1st row: 0_1483
2nd row: 2_226
3rd row: 1_374
4th row: 0_2186
5th row: 0_1351

Value                Count   Frequency (%)
1_369                1081    0.3%
1_417                1062    0.3%
1_498                1036    0.3%
1_414                1027    0.3%
1_425                1013    0.3%
1_398                952     0.3%
1_406                946     0.3%
1_413                944     0.3%
1_403                939     0.3%
1_407                933     0.3%
Other values (3254)  326780  97.1%

Most occurring characters

Value  Count   Frequency (%)
1      382993  20.6%
_      336713  18.1%
0      301062  16.2%
2      142873  7.7%
4      138052  7.4%
3      132907  7.1%
5      105868  5.7%
6      101227  5.4%
8      75015   4.0%
7      74341   4.0%

category
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 11
Median length: 9
Mean length: 9.5475464
Min length: 7
Histogram of lengths of the category

Characters and Unicode

Total characters: 3214783
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Vêtements
2nd row: Montres
3rd row: Accessoires
4th row: Vêtements
5th row: Vêtements

Common Values

Value        Count   Frequency (%)
Vêtements    209426  62.2%
Accessoires  109735  32.6%
Montres      17552   5.2%

Most occurring characters

Value             Count   Frequency (%)
e                 655874  20.4%
s                 556183  17.3%
t                 436404  13.6%
n                 226978  7.1%
c                 219470  6.8%
m                 209426  6.5%
ê                 209426  6.5%
V                 209426  6.5%
o                 127287  4.0%
r                 127287  4.0%
Other values (3)  237022  7.4%

sub_category
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 18
Median length: 8
Mean length: 9.4989769
Min length: 8
Histogram of lengths of the category

Characters and Unicode

Total characters: 3198429
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: T-shirts
2nd row: Montres connectées
3rd row: Sacs à main
4th row: T-shirts
5th row: T-shirts

Common Values

Value               Count   Frequency (%)
T-shirts            209426  62.2%
Sacs à main         109735  32.6%
Montres connectées  17552   5.2%

Word frequencies (lowercased)

Value       Count   Frequency (%)
t-shirts    209426  36.5%
sacs        109735  19.1%
à           109735  19.1%
main        109735  19.1%
montres     17552   3.1%
connectées  17552   3.1%

Most occurring characters

Value             Count   Frequency (%)
s                 563691  17.6%
i                 319161  10.0%
t                 244530  7.6%
(space)           237022  7.4%
r                 226978  7.1%
a                 219470  6.9%
-                 209426  6.5%
T                 209426  6.5%
h                 209426  6.5%
n                 162391  5.1%
Other values (8)  596908  18.7%

price
Real number (ℝ)

HIGH CORRELATION

Distinct: 1442
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.215189
Minimum: 0.62
Maximum: 300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0.62
5-th percentile: 4
Q1: 8.61
median: 13.9
Q3: 18.99
95-th percentile: 47.22
Maximum: 300
Range: 299.38
Interquartile range (IQR): 10.38

Descriptive statistics

Standard deviation: 17.855445
Coefficient of variation (CV): 1.0371914
Kurtosis: 45.425205
Mean: 17.215189
Median Absolute Deviation (MAD): 5.09
Skewness: 5.4791964
Sum: 5796577.8
Variance: 318.81693
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count   Frequency (%)
15.99                10563   3.1%
4.99                 9407    2.8%
10.99                8949    2.7%
3.99                 8713    2.6%
5.99                 8275    2.5%
11.99                8246    2.4%
14.99                8041    2.4%
8.99                 7785    2.3%
17.99                7572    2.2%
12.99                7461    2.2%
Other values (1432)  251701  74.8%

Minimum 10 values

Value  Count  Frequency (%)
0.62   14     < 0.1%
0.66   8      < 0.1%
0.77   3      < 0.1%
0.81   7      < 0.1%
0.88   3      < 0.1%
0.92   3      < 0.1%
0.93   4      < 0.1%
0.97   13     < 0.1%
0.98   4      < 0.1%
0.99   66     < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
300     8      < 0.1%
254.44  4      < 0.1%
247.22  66     < 0.1%
236.99  94     < 0.1%
233.54  3      < 0.1%
231.99  3      < 0.1%
230.04  111    < 0.1%
228.11  11     < 0.1%
225.17  36     < 0.1%
222.97  3      < 0.1%

stock_quantity
Real number (ℝ)

Distinct: 99
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.901587
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 27
median: 49
Q3: 76
95-th percentile: 95
Maximum: 99
Range: 98
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 28.628154
Coefficient of variation (CV): 0.56242164
Kurtosis: -1.169745
Mean: 50.901587
Median Absolute Deviation (MAD): 24
Skewness: -0.012410363
Sum: 17139226
Variance: 819.57121
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value              Count   Frequency (%)
43                 6473    1.9%
96                 5848    1.7%
15                 5715    1.7%
46                 5711    1.7%
44                 5702    1.7%
34                 5516    1.6%
58                 5483    1.6%
48                 5444    1.6%
61                 5401    1.6%
65                 5144    1.5%
Other values (89)  280276  83.2%

Minimum 10 values

Value  Count  Frequency (%)
1      3996   1.2%
2      3342   1.0%
3      3241   1.0%
4      2905   0.9%
5      2065   0.6%
6      2276   0.7%
7      2637   0.8%
8      3721   1.1%
9      4614   1.4%
10     3512   1.0%

Maximum 10 values

Value  Count  Frequency (%)
99     3177   0.9%
98     2959   0.9%
97     4381   1.3%
96     5848   1.7%
95     4985   1.5%
94     4106   1.2%
93     3477   1.0%
92     4819   1.4%
91     1724   0.5%
90     4296   1.3%

sex
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1
Histogram of lengths of the category

Characters and Unicode

Total characters: 336713
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: f
2nd row: f
3rd row: f
4th row: m
5th row: f

Common Values

Value  Count   Frequency (%)
m      169198  50.2%
f      167515  49.8%

Most occurring characters

Value  Count   Frequency (%)
m      169198  50.2%
f      167515  49.8%

birth
Real number (ℝ)

HIGH CORRELATION

Distinct: 76
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1977.8236
Minimum: 1929
Maximum: 2004
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1929
5-th percentile: 1952
Q1: 1971
median: 1980
Q3: 1987
95-th percentile: 1999
Maximum: 2004
Range: 75
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 13.524433
Coefficient of variation (CV): 0.0068380383
Kurtosis: 0.45233507
Mean: 1977.8236
Median Absolute Deviation (MAD): 8
Skewness: -0.58041737
Sum: 6.6595891 × 10^8
Variance: 182.9103
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value              Count   Frequency (%)
1980               23878   7.1%
1979               12418   3.7%
1988               12406   3.7%
1978               12270   3.6%
1986               11818   3.5%
1983               10630   3.2%
1984               10232   3.0%
1982               10102   3.0%
1977               10017   3.0%
1987               9732    2.9%
Other values (66)  213210  63.3%

Minimum 10 values

Value  Count  Frequency (%)
1929   86     < 0.1%
1930   115    < 0.1%
1931   84     < 0.1%
1932   159    < 0.1%
1933   163    < 0.1%
1934   274    0.1%
1935   143    < 0.1%
1936   387    0.1%
1937   454    0.1%
1938   431    0.1%

Maximum 10 values

Value  Count  Frequency (%)
2004   7348   2.2%
2003   2182   0.6%
2002   2223   0.7%
2001   2032   0.6%
2000   2174   0.6%
1999   4999   1.5%
1998   2356   0.7%
1997   2306   0.7%
1996   2943   0.9%
1995   2781   0.8%

age
Real number (ℝ)

HIGH CORRELATION

Distinct: 76
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.176432
Minimum: 20
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 20
5-th percentile: 25
Q1: 37
median: 44
Q3: 53
95-th percentile: 72
Maximum: 95
Range: 75
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 13.524433
Coefficient of variation (CV): 0.29288606
Kurtosis: 0.45233507
Mean: 46.176432
Median Absolute Deviation (MAD): 8
Skewness: 0.58041737
Sum: 15548205
Variance: 182.9103
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value              Count   Frequency (%)
44                 23878   7.1%
45                 12418   3.7%
36                 12406   3.7%
46                 12270   3.6%
38                 11818   3.5%
41                 10630   3.2%
40                 10232   3.0%
42                 10102   3.0%
47                 10017   3.0%
37                 9732    2.9%
Other values (66)  213210  63.3%

Minimum 10 values

Value  Count  Frequency (%)
20     7348   2.2%
21     2182   0.6%
22     2223   0.7%
23     2032   0.6%
24     2174   0.6%
25     4999   1.5%
26     2356   0.7%
27     2306   0.7%
28     2943   0.9%
29     2781   0.8%

Maximum 10 values

Value  Count  Frequency (%)
95     86     < 0.1%
94     115    < 0.1%
93     84     < 0.1%
92     159    < 0.1%
91     163    < 0.1%
90     274    0.1%
89     143    < 0.1%
88     387    0.1%
87     454    0.1%
86     431    0.1%

Interactions

36 pairwise interaction plots (images not reproduced in this text version).

Correlations

                   age   birth  category     day   month   price  quantity_sold     sex  stock_quantity  sub_category    year
age              1.000  -1.000     0.415  -0.002   0.024  -0.054         -0.002   0.080          -0.000         0.415   0.023
birth           -1.000   1.000     0.420   0.002   0.024   0.054          0.002   0.097           0.000         0.420   0.022
category         0.415   0.420     1.000   0.009   0.138   0.643          0.002   0.015           0.076         1.000   0.066
day             -0.002   0.002     0.009   1.000   0.039   0.002          0.002   0.010           0.000         0.009   0.060
month            0.024   0.024     0.138   0.039   1.000   0.022          0.000   0.008           0.007         0.138   1.000
price           -0.054   0.054     0.643   0.002   0.022   1.000          0.003   0.010          -0.069         0.643   0.010
quantity_sold   -0.002   0.002     0.002   0.002   0.000   0.003          1.000   0.000          -0.000         0.002   0.000
sex              0.080   0.097     0.015   0.010   0.008   0.010          0.000   1.000           0.000         0.015   0.005
stock_quantity  -0.000   0.000     0.076   0.000   0.007  -0.069         -0.000   0.000           1.000         0.076   0.008
sub_category     0.415   0.420     1.000   0.009   0.138   0.643          0.002   0.015           0.076         1.000   0.066
year             0.023   0.022     0.066   0.060   1.000   0.010          0.000   0.005           0.008         0.066   1.000
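
The -1.000 entry between age and birth is what one would expect if age was derived as a fixed reference year minus birth. The sketch below checks this on the (birth, age) pairs from the first five sample rows of the report. The reference year 2024 is inferred from those pairs rather than stated anywhere, and Pearson's coefficient is used purely for illustration; the report does not say which coefficient it computes for this pair.

```python
from math import sqrt

# (birth, age) pairs copied from the first five sample rows of the report.
pairs = [(1977, 47), (2000, 24), (1979, 45), (1963, 61), (1980, 44)]
REFERENCE_YEAR = 2024  # assumption: inferred from birth + age, not stated


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


births = [birth for birth, _ in pairs]
ages = [age for _, age in pairs]

# Every sampled pair satisfies birth + age == REFERENCE_YEAR.
derived_exactly = all(birth + age == REFERENCE_YEAR for birth, age in pairs)
r = pearson(births, ages)

print(f"age == {REFERENCE_YEAR} - birth for all pairs: {derived_exactly}")
print(f"r = {r:.3f}")
```

An exact linear relationship of this kind would also account for the two variables sharing the same standard deviation, range, and IQR, with skewness of opposite sign.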

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index, id_prod, year, month, day, time, session_id, client_id, quantity_sold, product_id, category, sub_category, price, stock_quantity, sex, birth, age
0, 0_1483, 2021, April, 10, 18:37:28, s_18746, c_4450, 5, 0_1483, Vêtements, T-shirts, 4.99, 48, f, 1977, 47
1, 2_226, 2022, February, 3, 01:55:53, s_159142, c_277, 6, 2_226, Montres, Montres connectées, 65.75, 63, f, 2000, 24
2, 1_374, 2021, September, 23, 15:13:46, s_94290, c_4270, 3, 1_374, Accessoires, Sacs à main, 10.71, 97, f, 1979, 45
3, 0_2186, 2021, October, 17, 03:27:18, s_105936, c_4597, 8, 0_2186, Vêtements, T-shirts, 4.20, 57, m, 1963, 61
4, 0_1351, 2021, July, 17, 20:34:25, s_63642, c_1242, 2, 0_1351, Vêtements, T-shirts, 8.99, 59, f, 1980, 44
5, 0_1085, 2021, September, 15, 05:47:48, s_90139, c_2526, 7, 0_1085, Vêtements, T-shirts, 3.99, 43, f, 1982, 42
6, 0_1508, 2021, August, 29, 05:39:01, s_82100, c_5799, 7, 0_1508, Vêtements, T-shirts, 8.03, 72, f, 1962, 62
7, 0_1627, 2021, September, 19, 14:54:52, s_92294, c_1422, 5, 0_1627, Vêtements, T-shirts, 3.99, 80, f, 1980, 44
8, 0_1469, 2022, February, 3, 07:13:22, s_159252, c_2207, 6, 0_1469, Vêtements, T-shirts, 14.99, 66, f, 1970, 54
9, 0_1453, 2022, February, 26, 09:03:10, s_171098, c_5433, 5, 0_1453, Vêtements, T-shirts, 7.99, 89, f, 1981, 43

Last rows

index, id_prod, year, month, day, time, session_id, client_id, quantity_sold, product_id, category, sub_category, price, stock_quantity, sex, birth, age
336703, 0_153, 2021, October, 17, 12:47:22, s_106129, c_682, 2, 0_153, Vêtements, T-shirts, 3.99, 6, f, 1974, 50
336704, 1_282, 2021, June, 1, 12:14:28, s_42561, c_2945, 8, 1_282, Accessoires, Sacs à main, 23.20, 49, f, 1968, 56
336705, 1_413, 2022, January, 10, 13:47:23, s_147795, c_3706, 1, 1_413, Accessoires, Sacs à main, 17.99, 48, m, 1987, 37
336706, 0_1475, 2021, July, 22, 13:47:39, s_65686, c_5607, 7, 0_1475, Vêtements, T-shirts, 11.99, 76, m, 1950, 74
336707, 1_498, 2022, February, 3, 01:39:15, s_159138, c_1857, 5, 1_498, Accessoires, Sacs à main, 23.37, 18, f, 1990, 34
336708, 1_671, 2021, May, 28, 12:35:46, s_40720, c_3454, 1, 1_671, Accessoires, Sacs à main, 31.99, 13, m, 1969, 55
336709, 0_759, 2021, June, 19, 00:19:23, s_50568, c_6268, 10, 0_759, Vêtements, T-shirts, 22.99, 65, m, 1991, 33
336710, 0_1256, 2021, March, 16, 17:31:59, s_7219, c_4137, 7, 0_1256, Vêtements, T-shirts, 11.03, 13, f, 1968, 56
336711, 2_227, 2021, October, 30, 16:50:15, s_112349, c_5, 8, 2_227, Montres, Montres connectées, 50.99, 94, f, 1994, 30
336712, 0_1417, 2021, June, 26, 14:38:19, s_54117, c_6714, 3, 0_1417, Vêtements, T-shirts, 17.99, 38, f, 1968, 56